This paper presents a hierarchical framework for planning and controlling the manipulation of rigid objects involving grasp changes with a fully actuated multi-fingered robotic hand. Although the framework can be applied to dexterous manipulation in general, we focus on a more demanding definition of in-hand manipulation, in which the goal is to reach an object pose with a grasp suitable for using the object as a tool. A high-level planner determines the object trajectory together with the grasp changes, i.e., adding, removing, or sliding fingers, which are executed by low-level controllers. While the grasp sequences are planned online by a learning-based policy so as to adapt to variations, the trajectory planner and the low-level controllers for object tracking and contact force control are purely model-based, so that the plan is executed robustly. By infusing knowledge about the physics of the problem and about the low-level controllers into the grasp planner, it learns to successfully generate grasps similar to those produced by model-based optimization approaches, obviating the high computational cost such methods incur when reacting to variations. Through experiments in physics simulation on realistic tool-use scenarios, we demonstrate the success of our method on different tool-use tasks and dexterous hand models. In addition, we show that this hybrid method offers greater robustness to trajectory and task variations than a purely model-based approach.
The long-standing theory that a colour-naming system evolves under the dual pressures of efficient communication and perceptual mechanism is supported by a growing number of linguistic studies, including the analysis of four decades of diachronic data from the Nafaanra language. This inspires us to explore whether artificial intelligence could evolve and discover a similar colour-naming system by optimising communication efficiency, represented by high-level recognition performance. Here, we propose a novel colour quantisation transformer, CQFormer, that quantises colour space while maintaining the accuracy of machine recognition on the quantised images. Given an RGB image, the Annotation Branch maps it into an index map before generating the quantised image with a colour palette, while the Palette Branch uses a key-point detection approach to find suitable colours in the palette from across the whole colour space. By interacting with the colour annotation, CQFormer is able to balance machine-vision accuracy against colour perceptual structure, such as a distinct and stable colour distribution, for the discovered colour system. Very interestingly, we even observe a consistent evolution pattern between our artificial colour system and basic colour terms across human languages. In addition, our approach also serves as an efficient quantisation method that effectively compresses image storage while maintaining high performance on high-level recognition tasks such as classification and detection. Extensive experiments demonstrate the superior performance of our method with extremely low bit-rate colours. We will release the source code soon.
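The index-map step described above can be illustrated with a minimal sketch: given a fixed palette, assign each pixel to its nearest palette colour and look the indices back up to form the quantised image. The palette here is hand-picked for illustration, not learned as in CQFormer, and the nearest-neighbour assignment in raw RGB space is an assumption of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)  # toy RGB image

# Hand-picked 5-colour palette (extremely low bit-rate regime).
palette = np.array([[0, 0, 0],
                    [255, 0, 0],
                    [0, 255, 0],
                    [0, 0, 255],
                    [255, 255, 255]], dtype=np.int64)

# Index map: nearest palette colour per pixel (Euclidean distance in RGB).
flat = image.reshape(-1, 3).astype(np.int64)
dists = np.linalg.norm(flat[:, None, :] - palette[None, :, :], axis=2)
index_map = dists.argmin(axis=1).reshape(image.shape[:2])

# Quantised image: look each index back up in the palette.
quantised = palette[index_map].astype(np.uint8)
```

Storing `index_map` plus the palette needs only 3 bits per pixel here, which is the compression the abstract refers to; CQFormer's contribution is learning the palette end-to-end so that recognition accuracy survives this reduction.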
We describe a new open-source image-processing pipeline for analysing camera-trap time-lapse recordings. The pipeline includes machine-learning models that assist humans with video segmentation and animal re-identification. We present performance results and observations on the practical utility of the pipeline from a year-long project studying the spatial ecology and social behaviour of the Gopher Tortoise.
Ordinary supervised learning is useful when we have paired training data of input $x$ and output $y$. However, such paired data can be difficult to collect in practice. In this paper, we consider the task of predicting $y$ when no paired data are available, but we have two separate, independent datasets, one containing $x$ and one containing $y$, each additionally paired with another variable $u$; that is, we have two datasets $S_x = \{(x_i, u_i)\}$ and $S_y = \{(u'_j, y'_j)\}$. A naive approach is to use $S_x$ to predict $u$ from $x$ and then use $S_y$ to predict $y$ from $u$, but we show that this is statistically inconsistent. Moreover, predicting $u$ can be harder in practice than predicting $y$, e.g., when $u$ has higher dimensionality. To circumvent these difficulties, we propose a new method that avoids predicting $u$ and instead learns $y = f(x)$ directly, by training $f(x)$ on $S_{x}$ to predict $h(u)$, where $h(u)$ is trained on $S_{y}$ to approximate $y$. We prove statistical consistency and error bounds for our method and experimentally confirm its practical usefulness.
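The proposed two-stage idea, training $h(u)$ on $S_y$ and then fitting $f(x)$ to the pseudo-labels $h(u_i)$ on $S_x$, can be sketched on synthetic data. The linear data-generating process, the dimensions, and the use of scikit-learn's `LinearRegression` are illustrative assumptions of this sketch, not the paper's setup.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Synthetic chain x -> u -> y, with u higher-dimensional than y.
n = 2000
x = rng.normal(size=(n, 5))
A = rng.normal(size=(5, 10))
u = x @ A + 0.1 * rng.normal(size=(n, 10))
b = rng.normal(size=(10, 1))
y = u @ b + 0.1 * rng.normal(size=(n, 1))

# Two separate, unpaired datasets sharing only the mediator u.
S_x_x, S_x_u = x[:1000], u[:1000]    # S_x = {(x_i, u_i)}
S_y_u, S_y_y = u[1000:], y[1000:]    # S_y = {(u'_j, y'_j)}

# Step 1: train h(u) on S_y to approximate y.
h = LinearRegression().fit(S_y_u, S_y_y)

# Step 2: train f(x) on S_x against the pseudo-labels h(u_i),
# so u itself never has to be predicted at test time.
f = LinearRegression().fit(S_x_x, h.predict(S_x_u))

# Evaluate f on fresh inputs against the noiseless target x @ A @ b.
x_test = rng.normal(size=(500, 5))
y_test = x_test @ A @ b
mse = float(np.mean((f.predict(x_test) - y_test) ** 2))
print(f"test MSE of f(x): {mse:.4f}")
```

In this linear setting both routes happen to work; the paper's point is that in general the naive route (predicting $u$ and then applying $h$) is statistically inconsistent, whereas the composition-free estimator above is consistent.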